A New Supervised Over-Sampling Algorithm with Application to Protein-Nucleotide Binding Residue Prediction

Authors

  • Jun Hu
  • Xue He
  • Dong-Jun Yu
  • Xi-Bei Yang
  • Jing-Yu Yang
  • Hong-Bin Shen
  • Yang Zhang
Abstract

Protein-nucleotide interactions are ubiquitous in a wide variety of biological processes. Accurately identifying interaction residues solely from protein sequences is useful for both protein function annotation and drug design, especially in the post-genomic era, because large volumes of protein data have not been functionally annotated. Protein-nucleotide binding residue prediction is a typical imbalanced learning problem, in which binding residues are far fewer than non-binding residues. Alleviating class imbalance has been shown to be a promising means of improving the performance of machine-learning-based predictors on imbalanced problems. However, little attention has been paid to the negative impact of class imbalance on protein-nucleotide binding residue prediction. In this study, we propose a new supervised over-sampling algorithm that synthesizes additional minority class samples to address class imbalance. Experimental results on protein-nucleotide interaction datasets demonstrate that the proposed algorithm can relieve the severity of class imbalance and help improve prediction performance. Based on the proposed over-sampling algorithm, a predictor called TargetSOS is implemented for protein-nucleotide binding residue prediction. Cross-validation tests and independent validation tests demonstrate the effectiveness of TargetSOS. The web server and datasets used in this study are freely available at http://www.csbio.sjtu.edu.cn/bioinf/TargetSOS/.
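
To make the idea of synthesizing minority class samples concrete, the following is a minimal sketch of generic over-sampling by random interpolation between pairs of minority samples (in the spirit of SMOTE). It is not the supervised algorithm proposed in the paper, whose details are not given in the abstract. The function names `oversample_minority` and `balance`, the label convention (1 for binding residues), and the toy feature vectors are assumptions made for illustration only.

```python
import random


def oversample_minority(minority, n_new, seed=0):
    """Create n_new synthetic points, each a random interpolation
    between two distinct minority samples (hypothetical helper)."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)   # pick two different minority samples
        t = rng.random()                 # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


def balance(features, labels, minority_label=1, seed=0):
    """Return new lists in which the minority class has been padded with
    synthetic samples until it matches the majority class in size."""
    minority = [f for f, y in zip(features, labels) if y == minority_label]
    n_majority = len(labels) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = oversample_minority(minority, n_new, seed)
    # Build new lists so the caller's inputs are left unchanged.
    return features + new_points, labels + [minority_label] * len(new_points)


# Toy data: 8 non-binding residues (label 0) and 2 binding residues (label 1).
X = [[0.0, 0.0] for _ in range(8)] + [[1.0, 1.0], [2.0, 3.0]]
y = [0] * 8 + [1] * 2
X_bal, y_bal = balance(X, y)
```

In this sketch every synthetic point lies on the segment between two existing minority samples, so no label information from the majority class is used. A supervised variant, such as the one the abstract describes, would presumably add a step that evaluates or filters synthetic samples, but that step is not specified here.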


Similar resources

Comparative modelling of 3D-structure of Geobacter sp. M21 (a metal reducing bacteria) Mn-Fe superoxide dismutase and its binding properties with bisphenol-A, aminotriazole and ethylene-diurea

Superoxide dismutase plays important roles in iron-respiratory bacteria such as Geobacteraceae as an antioxidant defense, and probably as an effective enzyme of the electron transfer network. Regarding the application of iron-respiratory bacteria in environmental biotechnology, particularly biodegradation and bioremediation, understanding the mechanism of inhibition/induction of superoxide dismutase by ...

Predicting Protein-Peptide Binding Affinity by Learning Peptide-Peptide Distance Functions

Many important cellular response mechanisms are activated when a peptide binds to an appropriate receptor. In the immune system, the recognition of pathogen peptides begins when they bind to cell membrane Major Histocompatibility Complexes (MHCs). MHC proteins then carry these peptides to the cell surface in order to allow the activation of cytotoxic T-cells. The MHC binding cleft is highly pol...

Effect of tillage and residue management on productivity of soybean and physico-chemical properties of soil in soybean–wheat cropping system

A microplot experiment was conducted in soybean–wheat cropping system at New Delhi during 2010-11 and 2011-12 to study the effect of continuous or cyclic tillage, viz., conventional tillage (CT) and zero-tillage (ZT) and residue management of either soybean (SR) and/or wheat (WR) on yield performance and soil physico-chemical properties. The experiment was laid out in randomized block desi...

Application of Linear Regression and Artificial Neural Network for Broiler Chicken Growth Performance Prediction

This study was conducted to investigate the prediction of growth performance using linear regression and artificial neural network (ANN) in broiler chicken. Artificial neural networks (ANNs) are powerful tools for modeling systems in a wide range of applications. The ANN model with a back propagation algorithm successfully learned the relationship between the inputs of metabolizable energy (kca...

Application of radioimmunoassay technique for determination of antigen concentration in different cells with a new monoclonal antibody [Persian]

Introduction: Binding a monoclonal antibody to tumor-associated antigens is an effective method for cancer therapy because these agents can specifically target malignant cells; in fact, monoclonal antibodies are effective agents for diagnosis, grading and treatment of different kinds of cancers. Methods: In this research, a new monoclonal antibody against colon cancer cells was prepared an...

Journal title:

Volume 9, Issue

Pages -

Publication date: 2014